Estimation of genomic breeding values using the Horseshoe prior
Author
Abstract
BACKGROUND: A method for estimating genomic breeding values (GEBV) based on the Horseshoe prior was introduced and applied to the dataset of the 16th QTLMAS workshop, which resembles three milk production traits. The method was compared with five commonly used methods: Bayes A, Bayes B, Bayes C, Bayesian Lasso and GBLUP.

METHODS: The main difference between the methods is the prior distribution assumed for the SNP effects. For the Bayesian Lasso the prior is a Laplace distribution; for Bayes A it is a Student-t; for Bayes B and Bayes C it is a spike-and-slab prior that combines a proportion of SNPs without effect and a proportion with effects distributed as a Student-t (Bayes B) or a Gaussian (Bayes C); for GBLUP it is similar to ridge regression. The density of the Horseshoe prior behaves like log(1 + 1/β²) (up to a constant). It has an infinite spike at zero and heavy tails that decay as β⁻² (slower than the Laplace or the Student-t). All methods except GBLUP were implemented with an MCMC approach in which the parameters defining the prior distributions were estimated jointly from the data. GBLUP was run in ASReml.

RESULTS: The accuracy for all methods ranged from 0.74 to 0.83, representing an improvement of 44% to 78% over the traditional BLUP evaluation. GEBV with the highest accuracy were obtained with Bayes A, Bayes B and the Horseshoe prior. The Horseshoe tended to select a smaller number of SNPs and assign them larger effects, while strongly shrinking the effects of the remaining SNPs toward zero.

CONCLUSIONS: The Horseshoe prior showed a different shrinkage pattern from the other methods. For this specific dataset this had little impact on the accuracy of the GEBV, but it may prove a useful property for discriminating true effects from noise and thereby improve overall prediction under different scenarios.
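The shapes of the priors named in the abstract can be compared numerically. The following is a minimal sketch, not the authors' implementation: it evaluates the unnormalised Horseshoe shape log(1 + 1/β²) quoted above next to unnormalised Laplace and Student-t shapes. The function names, the Laplace scale of 1 and the 4 degrees of freedom are illustrative assumptions, and the normalising constants are omitted, so only the qualitative behaviour (spike at zero, tail decay) should be read from it.

```python
import math


def horseshoe_shape(beta: float) -> float:
    """Unnormalised shape log(1 + 1/beta^2) quoted in the abstract (beta != 0)."""
    return math.log1p(1.0 / (beta * beta))


def laplace_shape(beta: float, scale: float = 1.0) -> float:
    """Unnormalised Laplace density; scale = 1 is an illustrative assumption."""
    return math.exp(-abs(beta) / scale)


def student_t_shape(beta: float, df: float = 4.0) -> float:
    """Unnormalised Student-t density; df = 4 is an illustrative assumption."""
    return (1.0 + beta * beta / df) ** (-(df + 1.0) / 2.0)


if __name__ == "__main__":
    # Small |beta| shows the spike at zero; large |beta| shows the tails.
    for beta in (1e-6, 1e-3, 1.0, 10.0, 100.0):
        print(
            f"beta={beta:g}  horseshoe={horseshoe_shape(beta):.3e}  "
            f"laplace={laplace_shape(beta):.3e}  "
            f"student_t={student_t_shape(beta):.3e}"
        )
    # In the tail, horseshoe_shape(beta) * beta^2 approaches 1,
    # which is the beta^-2 decay mentioned in the abstract.
    print(horseshoe_shape(100.0) * 100.0 ** 2)
```

As β approaches zero the Horseshoe shape grows without bound, while the Laplace and Student-t shapes stay finite; for large |β| it falls off more slowly than either, which matches the pattern of a few large effects and strong shrinkage elsewhere described in the results.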
Similar resources
Effect of Markers Effect Estimation Methods, Population Structure and Trait Architecture on the Accuracy of Genomic Breeding Values
This study aimed to investigate the effect of the marker effect estimation method, QTL distribution, number of QTL, effective population size and trait heritability on the accuracy of genomic predictions. Two effective population sizes, 100 and 500 individuals, were simulated with the QMSim software. A 100 cM genome consisting of one chromosome was simulated where 500 SNPs and two diffe...
Estimating the Accuracy of Genomic Selection in Small Genetic Populations: A Simulation Study
In the present study two genetically connected populations, one small and one large, were simulated, and the effect of different sources of information from foreign populations on the accuracy of predicted genomic breeding values of young animals of the small population was investigated. A large population consisting of 200000 animals over 15 generations and a small population consisting of 5000 animals over 3 ...
Comparing Different Marker Densities and Various Reference Populations Using Pedigree-Marker Best Linear Unbiased Prediction (BLUP) Model
For genomic selection to be applied successfully, the reference population and marker density should be chosen properly. The purpose of this study was to investigate the accuracy of genomic estimated breeding values at low (5K), intermediate (50K) and high (777K) marker densities in simulated populations, under different scenarios for selecting the reference population. Af...
A Comparison of the Sensitivity of the BayesC and Genomic Best Linear Unbiased Prediction (GBLUP) Methods of Estimating Genomic Breeding Values under Different Quantitative Trait Locus (QTL) Model Assumptions
The objective of this study was to compare the accuracy of estimating and predicting breeding values using two different approaches, GBLUP and BayesC, on data simulated under different quantitative trait locus (QTL) effect distributions. Data were simulated with three different distributions for the QTL effects: uniform, normal and gamma (1.66, 0.4). The number of QTL was assumed to be...
Comparison of Single and Multi-Step Bayesian Methods for Predicting Genomic Breeding Values in Genotyped and Non-Genotyped Animals- A Simulation Study
The purpose of this study was to compare the accuracy of genomic evaluation for the multi-step methods Bayes A, Bayes B, Bayes C and Bayes L and the single-step methods SSBR-C and SSBR-A at different values of π for predicting genomic breeding values of genotyped and non-genotyped animals. A genome with 40000 SNPs on 20 chromosomes of equal length (100 cM) was simulated. The π valu...